Multi-Document Arabic Summarization Using Text Clustering to Reduce Redundancy

Authors

  • Samer Abdulateef Waheeb
  • Husniza Husni
Abstract

Multi-document summarization is the process of producing a single summary of a collection of related documents. In this work we focus on generic extractive Arabic multi-document summarizers and describe a cluster-based approach to multi-document summarization. The main problem in multi-document text summarization is sentence redundancy, which must be eliminated to ensure coherence and improve readability. Our main objective is therefore to examine salient information in multi-document summarization for the Arabic text summarization task in the presence of noisy and redundant information. The final summaries for the ten categories of related documents are evaluated using Recall and Precision, with the best Recall achieved being 0.6 and Precision 0.6.
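The abstract does not spell out the clustering or scoring procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes whitespace tokenization, bag-of-words cosine similarity, a greedy single-pass clustering with a hypothetical `threshold` parameter, and sentence-level Precision and Recall against a reference extract. Real Arabic text would likely need normalization and stemming first, which is omitted here.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering (an assumption, not the paper's algorithm).

    A sentence joins the first cluster whose leader (its first sentence) is at
    least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of sentence indices.
    """
    vectors = [Counter(s.split()) for s in sentences]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(sentences, threshold=0.5):
    """Keep one representative (the leader) per cluster to drop near-duplicates."""
    return [sentences[c[0]] for c in cluster_sentences(sentences, threshold)]


def precision_recall(selected, reference):
    """Sentence-level Precision and Recall of a summary against a reference extract."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "stocks fell sharply today",
    ]
    summary = summarize(docs)
    print(summary)
    print(precision_recall(summary, [docs[0], "an unrelated reference sentence"]))
```

Choosing the first sentence of each cluster as its representative keeps the sketch short; a scoring function (for example, position or term weight) could replace it without changing the rest.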

Similar resources

A Proposed Textual Graph Based Model for Arabic Multi-document Summarization

Text summarization is still an active area of research in natural language processing. Several methods proposed in the literature to solve this task have had mixed success. However, the methods developed for multi-document Arabic text summarization are based on extractive summaries, and none of them is oriented to abstractive summaries. This is due to the challenges of...

User-Focused Multi-Document Summarization with Paragraph Clustering and Sentence-Type Filtering

Applying document clustering techniques to multidocument summarization is a challenging problem, mostly because of the redundancy that exists in multiple sources. We compare several document clustering techniques for multi-document summarization in the NTCIR-4 TSC test collection. We conducted an experiment to evaluate the effectiveness of reducing redundancy in the production of summaries. Fro...

A multi-document summarization system based on statistics and linguistic treatment

The quantity of data available today on the Internet has reached such a huge volume that it has become humanly unfeasible to efficiently sieve useful information from it. One solution to this problem is offered by text summarization techniques. Text summarization, the process of automatically creating a shorter version of one or more text documents, is an important way of finding ...

Automatic Multi-Document Arabic Text Summarization Using Clustering and Keyphrase Extraction

Automatic text summarization has become important due to the rapid growth of textual information, since it is very difficult for human beings to manually summarize large documents. A full understanding of the document is essential to form an ideal summary. However, achieving full understanding is either difficult or impossible for computers. Therefore, selecting important sentences from t...

Using a Double Clustering Approach to Build Extractive Multi-document Summaries

This paper presents a method for extractive multi-document summarization that explores a two-phase clustering approach. First, sentences are clustered by similarity, and one sentence per cluster is selected, to reduce redundancy. Then, in order to group them according to topics, those sentences are clustered considering the collection of keywords that represent the topics in the set of texts. E...

Publication date: 2014